103 research outputs found

    Multi-task Representation Learning for Pure Exploration in Linear Bandits

    Despite the recent success of representation learning in sequential decision making, the study of the pure exploration scenario (i.e., identifying the best option while minimizing the sample complexity) is still limited. In this paper, we study multi-task representation learning for best arm identification in linear bandits (RepBAI-LB) and best policy identification in contextual linear bandits (RepBPI-CLB), two popular pure exploration settings with wide applications, e.g., clinical trials and web content optimization. In these two problems, all tasks share a common low-dimensional linear representation, and our goal is to leverage this feature to accelerate the best arm (policy) identification process for all tasks. For these problems, we design computationally and sample efficient algorithms, DouExpDes and C-DouExpDes, which perform double experimental designs to plan optimal sample allocations for learning the global representation. We show that by learning the common representation among tasks, our sample complexity is significantly better than that of the naive approach, which solves tasks independently. To the best of our knowledge, this is the first work to demonstrate the benefits of representation learning for multi-task pure exploration.
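    As a rough illustration of why a shared low-dimensional representation can help, the sketch below simply counts unknown parameters. It assumes (this is not stated in the abstract) that each of M tasks has a d-dimensional parameter that factors through a shared d-by-k matrix with k much smaller than d. The function names and the numbers are hypothetical, and this is parameter counting only, not the sample complexity bound proved in the paper.

```python
def independent_params(num_tasks: int, d: int) -> int:
    """Unknowns when every task learns its own d-dimensional parameter."""
    return num_tasks * d


def shared_params(num_tasks: int, d: int, k: int) -> int:
    """Unknowns (at most) when tasks share one d-by-k representation
    and each task only learns a k-dimensional weight vector.
    The factorization itself is an assumption made for illustration."""
    return d * k + num_tasks * k


if __name__ == "__main__":
    # Hypothetical sizes: 50 tasks, ambient dimension 100, shared dimension 5.
    m, d, k = 50, 100, 5
    print("independent:", independent_params(m, d))
    print("shared:     ", shared_params(m, d, k))
```

    With many tasks, the one-off cost of the shared matrix is spread across all of them, which is the intuition behind learning the representation jointly.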

    Provably Safe Reinforcement Learning with Step-wise Violation Constraints

    In this paper, we investigate a novel safe reinforcement learning problem with step-wise violation constraints. Our problem differs from existing works in that we consider stricter step-wise violation constraints and do not assume the existence of safe actions, making our formulation more suitable for safety-critical applications which need to ensure safety in all decision steps and may not always possess safe actions, e.g., robot control and autonomous driving. We propose a novel algorithm SUCBVI, which guarantees $\widetilde{O}(\sqrt{ST})$ step-wise violation and $\widetilde{O}(\sqrt{H^3SAT})$ regret. Lower bounds are provided to validate the optimality in both violation and regret performance with respect to $S$ and $T$. Moreover, we further study a novel safe reward-free exploration problem with step-wise violation constraints. For this problem, we design an $(\varepsilon,\delta)$-PAC algorithm SRF-UCRL, which achieves nearly state-of-the-art sample complexity $\widetilde{O}\left(\left(\frac{S^2AH^2}{\varepsilon}+\frac{H^4SA}{\varepsilon^2}\right)\left(\log\left(\frac{1}{\delta}\right)+S\right)\right)$, and guarantees $\widetilde{O}(\sqrt{ST})$ violation during the exploration. The experimental results demonstrate the superiority of our algorithms in safety performance, and corroborate our theoretical results.
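    To get a feel for how the quoted bounds scale, the following sketch evaluates the expressions inside the $\widetilde{O}(\cdot)$ with all constants and hidden logarithmic factors dropped. The function names are hypothetical, and the numbers it produces are only meaningful as ratios between settings, not as actual sample counts.

```python
import math


def srf_ucrl_sample_scale(S: int, A: int, H: int, eps: float, delta: float) -> float:
    """Expression inside the O-tilde for the SRF-UCRL sample complexity,
    with constants and hidden log factors omitted (an assumption of this sketch)."""
    leading = (S**2 * A * H**2) / eps + (H**4 * S * A) / eps**2
    return leading * (math.log(1.0 / delta) + S)


def violation_scale(S: int, T: int) -> float:
    """Expression inside the O-tilde for the step-wise violation bound."""
    return math.sqrt(S * T)


if __name__ == "__main__":
    # Hypothetical problem sizes chosen only to show the scaling.
    coarse = srf_ucrl_sample_scale(S=10, A=5, H=10, eps=0.01, delta=0.05)
    fine = srf_ucrl_sample_scale(S=10, A=5, H=10, eps=0.005, delta=0.05)
    print("ratio when eps is halved:", fine / coarse)
    print("violation scale for S=4, T=100:", violation_scale(4, 100))
```

    For small $\varepsilon$ the $1/\varepsilon^2$ term dominates, so halving $\varepsilon$ roughly quadruples the expression.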

    EXPERIMENTAL RESEARCH ON THE VIBRATION CHARACTERISTICS OF BRIDGE'S HORIZONTAL ROTATION SYSTEM

    As a new construction method, the bridge horizontal rotation construction method can reduce the impact on traffic under the bridge. During the horizontal rotation of the bridge, construction errors in the contact surface of the spherical hinge inevitably cause a vibration response in the overall structure. Because of the large weight of the structure and the long cantilever of the superstructure, the vibration at the spherical hinge is amplified at the girder end, which adversely affects the stability of the structure. Taking a 10,000-ton rotating bridge as a reference, a scaled model was built to test the vibration of the girder during the rotation of the horizontal rotation system. By analyzing the frequency-domain curve of the girder vibration together with the results of simulation calculations, it is found that the vertical vibration displacement response is related to the first three longitudinal bending modes of the girder structure, but not to the higher modes or other modes. Using the harmonic response analysis module in ANSYS, it is proposed that the structural vibration can be minimized by controlling the rotation speed so that the excitation frequency stays within the first-order modal frequency of the girder. This research also establishes an expression relating the vertical vibration velocity and acceleration of the girder end of the horizontal rotation system to the vibration frequency of the girder. On that basis, it is proposed that the stability of the horizontal rotation can be predicted by monitoring the vertical velocity and acceleration of the cantilever girder end during the rotation.
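    The abstract does not give the expression it establishes. As a hedged illustration of how velocity, acceleration and frequency can be linked, the sketch below uses the textbook relations for a single-frequency harmonic vibration x(t) = X sin(2πft): peak velocity 2πfX and peak acceleration (2πf)²X, so that f = a_peak / (2π v_peak). This is an assumed idealization, not the paper's formula, and the function names and values are hypothetical.

```python
import math


def peak_velocity(amplitude_m: float, freq_hz: float) -> float:
    """Peak velocity of a single-frequency harmonic vibration."""
    return 2.0 * math.pi * freq_hz * amplitude_m


def peak_acceleration(amplitude_m: float, freq_hz: float) -> float:
    """Peak acceleration of a single-frequency harmonic vibration."""
    return (2.0 * math.pi * freq_hz) ** 2 * amplitude_m


def frequency_from_peaks(v_peak: float, a_peak: float) -> float:
    """Recover the vibration frequency from monitored peak velocity and
    peak acceleration, under the harmonic-motion assumption."""
    return a_peak / (2.0 * math.pi * v_peak)


if __name__ == "__main__":
    # Hypothetical girder-end vibration: 2 mm amplitude at 1.5 Hz.
    v = peak_velocity(0.002, 1.5)
    a = peak_acceleration(0.002, 1.5)
    print("estimated frequency (Hz):", frequency_from_peaks(v, a))
```

    Under this idealization, monitoring velocity and acceleration at the girder end is enough to estimate the dominant vibration frequency without measuring displacement directly.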

    MECHANICAL BEHAVIOR OF HORIZONTAL SWIVEL SYSTEM WITH UHPC SPHERICAL HINGE UNDER SEISMIC ACTION

    During rotation, the total weight of the bridge structure is jointly supported by the spherical hinge and the supporting structure, and its lateral stability is poor, so it can easily lose stability under dynamic loads such as seismic action. The present paper takes a 10,000-ton continuous rigid frame swivel bridge as the research object, analyzes the dynamic response of the horizontal swivel system to seismic action, and establishes several structural simulation models. Eighteen seismic waves in three directions that meet the calculation requirements are selected for time history analysis, and the results are compared with the response spectrum method. Finally, an optimization algorithm for the seismic response of the bridge under the horizontal swivel system is proposed based on the mode superposition method. The UHPC spherical hinge bears all the vertical forces and 20% of the bending moment caused by the seismic action, while the support structure bears the remaining 80% of the bending moment. The optimization algorithm proposed in this paper features high accuracy.

    Tissue distribution and excretion of the five components of Portulaca oleracea L. extract in rat assessed by UHPLC

    The aim of the present study was to investigate the tissue distribution and excretion of five components of Portulaca oleracea L. extract (POE) in rats following oral administration. A rapid, sensitive and specific ultra-high performance liquid chromatography (UHPLC) method with puerarin as the internal standard was used for the quantitative analysis of five components of POE, namely caffeic acid (CA), p-coumaric acid (p-CA), ferulic acid (FA), quercitrin (QUER) and hesperidin (HP), in rat tissues (liver, intestine, stomach, muscle, heart, lung, brain, kidney and spleen) as well as in urine and feces. The results show that only p-CA and FA were found in nearly all tissues with low cumulative ratios, and CA was higher in the intestine and stomach with a slightly higher cumulative ratio in the urine and feces after 24 h. HP and QUER were found at low levels in the tissues with low cumulative ratios.

    Synthesis of a magnetic π-extended carbon nanosolenoid with Riemann surfaces

    Riemann surfaces are deformed versions of the complex plane in mathematics. Locally they look like patches of the complex plane, but globally, the topology may deviate from a plane. Nanostructured graphitic carbon materials resembling a Riemann surface with helicoid topology are predicted to have interesting electronic and photonic properties. However, fabrication of such processable and large π-extended nanographene systems has remained a major challenge. Here, we report a bottom-up synthesis of a metal-free carbon nanosolenoid (CNS) material with a low optical bandgap of 1.97 eV. The synthesis procedure is rapid and possible on the gram scale. The helical molecular structure of CNS can be observed by direct low-dose high-resolution imaging, using integrated differential phase contrast scanning transmission electron microscopy. Magnetic susceptibility measurements show paramagnetism with a high spin density for CNS. Such a π-conjugated CNS allows for the detailed study of its physical properties and may form the basis for the development of electronic and spintronic devices containing CNS species.